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Abstract. We introduce a novel strategy for constructing symmetric positive definite (SPD) 
preconditioners for linear systems with symmetric indefinite matrices. The strategy, called absolute 
value preconditioning, is motivated by the observation that the preconditioned minimal residual 
method with the inverse of the absolute value of the matrix as a preconditioner converges to the 
exact solution of the system in at most two steps. Neither the exact absolute value of the matrix 
nor its exact inverse are computationally feasible to construct in general. However, we provide 
a practical example of an SPD preconditioner that is based on the suggested approach. In this 
example we consider a model problem with a shifted discrete negative Laplacian, and suggest a 
geometric multigrid (MG) preconditioner, where the inverse of the matrix absolute value appears 
only on the coarse grid, while operations on finer grids are based on the Laplacian. Our numerical 
tests demonstrate practical effectiveness of the new MG preconditioner, which leads to a robust 
iterative scheme with minimalist memory requirements. 

Key words. Preconditioning, linear system, preconditioned minimal residual method, polar 
decomposition, matrix absolute value, multigrid, polynomial filtering 

AMS subject classifications. 15A06, 65F08, 65F10, 65N22, 65N55 

1. Introduction. Large, sparse, symmetric, and indefinite systems arise in a va- 
riety of applications. For example, in the form of saddle point problems, such systems 
result from mixed finite element discretizations of underlying differential equations of 
fluid and solid mechanics; see, e.g., [3] and references therein. In acoustics, large sparse 
symmetric indefinite systems are obtained after discretizing the Helmholtz equation 
for certain media types and boundary conditions. Often the need to solve symmetric 
indefinite problems comes as an auxiliary task within other computational routines, 
such as the inner step in interior point methods in linear and nonlinear optimiza- 
tion [3, 26], or solution of the correction equation in the Jacobi-Davidson method [36] 
for a symmetric eigenvalue problem. 

We consider an iterative solution of a linear system Ax = b, where the matrix 
A is real nonsingular and symmetric indefinite, i.e., the spectrum of A contains both 
positive and negative eigenvalues. In order to improve the convergence, we intro- 
duce a preconditioner T and formally replace Ax = 6 by the preconditioned system 
TAx — Tb. If T is properly chosen, an iterative method for this system can exhibit 
a better convergence behavior compared to a scheme applied to Ax = b. Neither the 
preconditioner T nor the preconditioned matrix TA is normally explicitly computed. 

If T is not symmetric positive definite (SPD), then TA^ in general, is not symmet- 
ric with respect to any inner product [29, Theorem 15.2.1]. Thus, the introduction 
of a non-SPD preconditioner replaces the original symmetric problem Ax = & by a 
generally nonsymmetric TAx ~ Tb. Specialized methods for symmetric linear sys- 
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terns are no longer applicable to the preconditioned problem, and must be replaced by 
iterative schemes for nonsymmetric linear systems; e.g., GMRES or GMRES(m) [35], 
Bi-CGSTAB [41], and QMR [17]. 

The approach based on the choice of a non-SPD preconditioner, which leads to 
solving a nonsymmetric problem, has several disadvantages. First, no short-term re- 
current scheme that delivers an optimal Krylov subspace method is typically available 
for a nonsymmetric linear system [15]. In practice, this means that implementations 
of the optimal methods (e.g., GMRES) require an increasing amount of work and 
storage at every new step, and hence are often computationally expensive. 

Second, the convergence behavior of iterative methods for nonsymmetric linear 
systems is not completely understood. In particular, the convergence may not be 
characterized in terms of reasonably accessible quantities, such as the spectrum of the 
preconditioned matrix; see the corresponding results for GMRES and GMRES (to) 
in [20, 43]. This makes it difficult to predict computational costs. 

If T is chosen to be SPD, i.e., T = T* > 0, then the matrix TA of the precondi- 
tioned linear system is symmetric with respect to the T"^ -inner product defined by 
{u, u)t-i — {u, T^^v) for any pair of vectors u and v. Here (•, •) denotes the Euclidean 
inner product {u,v) = v*u, in which the matrices A and T are symmetric. Due to 
this symmetry preservation, system TAx — Tb can be solved using an optimal Krylov 
subspace method that admits a short-term recurrent implementation, such as precon- 
ditioned MINRES (PMINRES) [14, 28]. Moreover, the convergence of the method 
can be fully estimated in terms of the spectrum of TA. 

In light of the above discussion, the choice of an SPD preconditioner for a sym- 
metric indefinite linear system can be regarded as natural and favorable, especially 
if corresponding non-SPD preconditioning strategies fail to provide convergence in a 
small number of iterations. We advocate the use of SPD preconditioning. 

The question of constructing SPD preconditioners for symmetric indefinite sys- 
tems has been widely studied in many applications. For saddle point problems, the 
block-diagonal SPD preconditioning has been addressed, e.g., in [16, 37, 44]. In [2], 
it was proposed to use an inverse of the negative Laplacian as an SPD preconditioner 
for indefinite Helmholtz problems. This approach was further extended in [25] by 
introducing a shift into the preconditioner. Another strategy was suggested in [18], 
primarily in the context of linear systems arising in optimization. It is based on the 
so-called Bunch-Parlett factorization [9] . 

We introduce here a different idea of constructing SPD preconditioners that re- 
semble the inverse of the absolute value of the coefficient matrix. Throughout, the 
absolute value of A is defined as a matrix function \ A\ = V |A| V* , where A = VAV* is 
the eigenvalue decomposition of A. We are motivated by the observation that PMIN- 
RES with 1^1^^ as a preconditioner converges to the exact solution in at most two 
steps. We refer to the new approach as the absolute value (AV) preconditioning and 
call the corresponding preconditioners the AV preconditioners. 

The direct approach for constructing an AV preconditioner is to approximately 
solve \A\ z = r. However, \A\ is generally not available, which makes the application 
of standard techniques, such as, e.g., incomplete factorizations, approximate inverses, 
problematic. The vector \A\ ^r can also be found using matrix function compu- 
tations, normally fulfilled by a Krylov subspace method [19, 23] or a polynomial 
approximation [30, 31]. Our numerical experience shows that the convergence, with 
respect to the outer iterations, of a linear solver can be significantly improved with 
this approach, but the computational costs of approximating f{A)r = |^| ^ r may be 
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too high, i.e., much higher than the cost of matrix- vector muhiplication with A. 

Introduction of the general concept of the AV preconditioning is the main the- 
oretical contribution of the present work. As a proof of concept example of the AV 
preconditioning, we use a geometric multigrid (MG) framework. To investigate appli- 
cability and practical effectiveness of the proposed idea, we choose a model problem 
resulting from discretization of a shifted Laplacian (Helmholtz operator) on a unit 
square with Dirichlet boundary conditions. The obtained linear system is real sym- 
metric indefinite. We construct an MG AV preconditioner that, used in the PMINRES 
iteration, delivers an efhcient computational scheme. 

Let us remark that the same model problem has been considered in [4] , where the 
authors utilize the coarse grid approximation to reduce the indefinite problem to the 
SPD system. Satisfactory results have been reported for small shifts, i.e., for slightly 
indefinite systems. However, the limitation of the approach lies in the requirement 
on the size of the coarse space, which should be chosen sufficiently large. As we show 
below, the MG AV preconditioner presented in this paper allows keeping the coarsest 
problem reasonably small, even if the shift is large. 

Numerical solution of Helmholtz problems is an object of active research; see, 
e.g., [1, 6, 12, 13, 21, 27, 40]. A typical Helmholtz problem is approximated by a com- 
plex symmetric (non-Hermitian) system. The real symmetric case of the Helmholtz 
equation, considered in this paper, is less common. However, methods for complex 
problems are evidently applicable to our particular real case, which allows us to make 
numerical comparisons with known Helmholtz solvers. 

We test several of solvers, based on the inverted Laplacian and the standard 
MG preconditioning, to compare with the proposed AV preconditioning. In fact, the 
inverted (shifted) Laplacian preconditioning [2, 25] for real Helmholtz problems can 
be viewed as a special case of our AV preconditioning. In contrast to preconditioners 
in [18] relying on the Bunch-Parlett factorization, we show that the AV preconditioners 
can be constructed without any decompositions of the matrix, which is crucial for very 
large or matrix- free problems. 

This paper is organized as follows. In Section 2, we present and justify the gen- 
eral notion of an AV preconditioner. The rest of the paper deals with the question of 
whether AV preconditioners can be efficiently constructed in practice. In Section 3, 
we give a positive answer by constructing an example of a geometric MG AV precon- 
ditioner for the model problem. The efficiency of this preconditioner is demonstrated 
in our numerical tests in Section 4. We conclude in Section 5. 

2. AV preconditioning for symmetric indefinite systems. Given an SPD 
preconditioner T, we consider solving a linear system with the preconditioned minimal 
residual method, implemented in the form of the preconditioned MINRES (PMINRES) 
algorithm [14, 28]. In the absence of round-off errors, at step i, the method constructs 
an approximation a:;^*-' to the solution of Ax — b of the form 



(2.1) 





Here, JC, {TA,Tr^^^) = span {Tr^"), (T^)rr(o), . . . , [TAy-^Tr^^^] is the Krylov sub- 
space generated by the matrix TA and the vector Tr^'^\ the T-norni is defined by 
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\\v\\^ — {v,v)t for any v, and a;*-"-* is the initial guess. Scheme (2.1)-(2.2) represents 
an optimal Krylov subspace method and the PMINRES implementation is based on 
a short-term recurrence. The conventional convergence rate bound for (2.1)-(2.2) can 
be found, e.g., in [14], and relies solely on the distribution of eigenvalues of TA. 

The following trivial, but important, theorem regards |^| as an SPD precondi- 
tioner for a symmetric indefinite system. 

Theorem 2.1. The preconditioned minimal residual method (2.1)-(2.2) with 
preconditioner T — \ A\ converges to the solution of Ax — b in at most two steps. 

Theorem 2.1 implies that T = \ A\ is an ideal SPD preconditioner. Note that the 
theorem holds not only for the preconditioned minimal residual method (2.1)-(2.2), 
but for all methods where convergence is determined by the degree of the minimal 
polynomial of TA. 

In practical situations, the computation of an ideal SPD preconditioner T = | A| ^ 
is prohibitively costly. However, we show that it is possible to construct inexpensive 
SPD preconditioners that resemble |^|~^ and can significantly accelerate the conver- 
gence of an iterative method. 

Definition 2.2. We call an SPD preconditioner T for a symmetric indefinite 
linear system Ax — h an AV preconditioner if it satisfies 

(2.3) 5o{v,T-^v) < {v,\A\v) < 5i{v,T-^v), Wv 

with constants Si > 6q > 0, such that the ratio 6i/Sq > 1 is reasonably small. 

Let us remark that Definition 2.2 of the AV preconditioner is informal because 
no precise assumption is made of how small the ratio 6i/6o should be. It is clear 
from (2.3) that Si/Sq measures how well the preconditioner T approximates |^| , 
up to a positive scaling. If A represents a hierarchy of mesh problems then it is 
desirable that Si/6q is independent of the problem size. In this case, if A is SPD, 
Definition 2.2 of the AV preconditioner is consistent with the well known concept of 
spectrally equivalent preconditioning for SPD systems; see [10]. 

The following theorem provides bounds for eigenvalues of the preconditioned ma- 
trix TA in terms of the spectrum of T [A]. We note that T and A, and thus TA and 
T]j4], do not in general commute. Therefore, our spectral analysis cannot be based 
on a traditional matrix analysis tool, a basis of eigenvectors. 

Theorem 2.3. Given a nonsingular symmetric indefinite A G M"^" and an 
SPD T G M"^", let fj,i < fi2 l£ ■ ■ . l£ /^n be the eigenvalues ofT \ A\. Then eigenvalues 
Ai<...<Ap<0< Xp+i < ■ . • < A„ ofTA are located in intervals 

(•2 4-1 -fJ-n-j+i < < j = l,...,p; 

l^j-p < < fij, j ^p+l,...,n. 

Proof. We start by observing that the absolute value of the Rayleigh quotient of 
the generalized eigenvalue problem Av — X\A\v is bounded by 1, i.e., 

(2.5) \{v,Av)\ < {v,\A\v), Vv G M". 

Now, we recall that the spectra of matrices T\A\ and TA are given by the general- 
ized eigenvalue problems ]A] u = ^T^^v and Av = XT~^v, respectively, and introduce 
the corresponding Rayleigh quotients 



(2.6) 
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Let us fix any index j e {1, 2, . . . , n}, and denote by S an arbitrary subspace of 
M" sucli that dim(5) = j. Since inequality (2.5) also holds on S, using (2.6) we write 

(2.7) -ipiv) <(t){v) <^{v), V € S. 

Moreover, taking the maxima in vectors v £ S, and after that the minima in subspaces 
S E ^ {S C M" : dim(S') = j}, of all parts of (2.7) preserves the inequalities, so 

(2.8) min max(—^(v)) < min Taaxdiv) < min mBx^hiv). 

By the Courant-Fischer theorem (see, e.g., [24, 29]) for the Rayleigh quotients ±4>{v) 
and (j){v) defined in (2.6), we conclude from (2.8) that 

Recalling that j has been arbitrarily chosen, we obtain the following bounds on the 
eigenvalues of TA: 

-^J.n~j+l < Aj < 0, j = l,...,p; 
^ ' ' < Xj < fij, j=p+l,...,n. 

Next, in order to derive nontrivial upper and lower bounds for the p negative and 
n — p positive eigenvalues Xj in (2.9), we use the fact that eigenvalues and Q of the 
generalized eigenvalue problems \A\ = STv and A~^v = C,Tv are the reciprocals 
of the eigenvalues of the problems |^| u = fiT^^v and Av = XT^^v, respectively, i.e., 

(2.10) 0<Ci = — < 6 - — <••• <e„ = —, 
and 

(2.11) Ci = ^ < • • • < Cf> = ^ < < Cp+i = < ■ • ■ < C« = 
Similar to (2.5), 

\{v,A-^v)\ < {v,\A\~^v), VveR". 

Thus, we can use the same arguments as those following (2.5) to show that rela- 
tions (2.7) and (2.8), with a fixed j E {1, 2, . . . , n}, also hold for 

(2.12) ^(.)^(^^f^,0(.)^(^,..M", 

{v,lv) (v,lv) 

where ip{v) and (/>(i') are now the Rayleigh quotients of the generalized eigenvalue 
problems \A\ ^ v = £Tv and A~^v = QTv, respectively. The Courant-Fischer theorem 
for ±-4>{v) and (t){v) in (2.12) allows us to conclude from (2.8) that 

^Cn-J + l < < ^3- 

Given the arbitrary choice of j in the above inequality, by (2.10)-(2.11) we get the 
following bounds on the eigenvalues of TA: 

/2 1Q^ < < 0, j = l,...,p; 

^ ■ > < 1/A, < l/^iJ-p, i=p + l,...,n. 
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Combining (2.9) and (2.13), we obtain (2.4). □ 

Theorem 2.3 suggests two useful implications given by the corresponding corol- 
laries below. In particular, the following result describes A{TA), i.e., the spectrum of 
the preconditioned matrix TA, in terms of 5q and 5i in (2.3). 

Corollary 2.4. Given a nonsingular symmetric indefinite A G M"^", an SPD 
T G M"^", and constants i5i > (5o > satisfying (2.3), we have 

(2.14) k{TA)(z[-6i,-6a\\J[6^,5i], 

where K{TA) is the spectrum ofTA. 

Proof. Follows directly from (2.3) and (2.4) with j = l,p,p+ l,n. □ 

The next corollary shows that the presence of reasonably populated clusters of 
eigenvalues in the spectrum of T \ A\ guarantees the occurrence of corresponding clus- 
ters in the spectrum of the preconditioned matrix TA. 

Corollary 2.5. Given a nonsingular symmetric indefinite A G M"^" and an 
SPD T G M"^", let 111 < fii+i < . . . < fii+k-i be a sequence of k eigenvalues of 
T\A\, where l<l<l + k — l<n and t = — Then, if k > p + 2, 

the k — p positive eigenvalues A/+p < Xi+p+i < ... < A;+fe_i of TA are such that 
\Xi+p — Xi+k-i\ ^ T. Also, if k > (ri — p) + 2, the k — [n — p) negative eigenvalues 
A„-fe-;+2 < • ■ ■ < \-i < Ap_/+i ofTA are such that \Xn-k-i+2 - -^p-;+i| < t. 

Proof. Follows directly from bounds (2.4). □ 

Corollary 2.4 implies that the ratio Si/Sq > 1 of the constants from (2.3) measures 
the quality of the AV preconditioner T. Indeed, the convergence speed of the precon- 
ditioned minimal residual method is determined by the spectrum of TA, primarily 
by the intervals of the right-hand side of inclusion (2.14). Additionally, Corollary 2.5 
prompts that a "good" AV preconditioner should ensure clusters of eigenvalues in the 
spectrum of This implies the clustering of eigenvalues of the preconditioned 

matrix TA, which has a favorable effect on the convergence behavior of a polynomial 
iterative method, such as PMINRES. 

In the next section, we construct an example of the AV preconditioner for a 
particular model problem. We apply the MG techniques. 

3. MG AV preconditioning for a model problem. Let us consider the 
following real boundary value problem, 

(3.1) - Aw(x, y) - c2u(x, y) = /(x, y), (x, y) e il = (0, 1) x (0, 1), u\r = 0, 

where A = /dx^ + /dy^ is the Laplace operator and F denotes the boundary of 
il. Problem (3.1) is a particular instance of the Helmholtz equation with Dirichlet 
boundary conditions, where c > is a wave number. 

After introducing a uniform grid of size h in both directions and using the standard 
5-point finite-difference stencil to discretize continuous problem (3.1), one obtains the 
corresponding discrete problem 

(3.2) {L-c^I)x = b, 

where A = L — c^I represents a discrete negative Laplacian L (later called "Lapla- 
cian"), satisfying the Dirichlet boundary condition shifted by a scalar c^. 
The common rule of thumb, see, e.g., [13, 22], for discretizing (3.1) is 



(3.3) 



ch < n/b. 
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Below, we call (3.2) the model problem. We assume that the shift is different from 
any eigenvalue of the Laplacian and is greater than the smallest but less than the 
largest eigenvalue. Thus, the matrix L — c^I is nonsingular symmetric indefinite. In 
the following subsection, we apply the idea of the AV preconditioning to construct an 
MG AV preconditioner for system (3.2). 

While our main focus throughout the paper is on the 2D problem (3.1), in order 
to simplify presentation of theoretical analysis, we also refer to the ID analogue 

(3.4) - u"(x) - c2w(x) = fix), ii(0) = u(l) = 0. 

The conclusions drawn from (3.4), however, remain qualitatively the same for the 2D 
problem of interest, which we test numerically. 

3.1. Two-grid AV preconditioner. Along with the fine grid of mesh size h 
underlying problem (3.2), let us consider a coarse grid of mesh size H > h. We 
denote the discretization of the Laplacian on this grid by Lh, and Ih represents the 
identity operator of the corresponding dimension. We assume that the exact fine-level 
absolute value \L — c^/| and its inverse are not computable, whereas the inverse of 
the coarse-level operator \Lh — c^Ih\ can be efhciently constructed. In the two-grid 
framework, we use the subscript H to refer to the quantities defined on the coarse 
grid. No subscript is used for denoting the fine grid quantities. 

While \ L — c^l\ is not available, let us assume that we have its SPD approximation 
i.e., B K. \L — c I\ and B — B* > 0. The operator B can be given in the explicit 
matrix form or through the action on a vector. We suggest the following general 
scheme as a two-grid AV preconditioner for model problem (3.2). 

Algorithm 3.1 (The two-grid AV preconditioner). Input: r, B ^ \L — c^I\. 
Output: w. 

1. Presmoothing. Apply v smoothing steps, u >1: 

(3.5) = w^*) -|-M-\r-Bw(*'), i = 0, . . . , - 1, u>(°) = 0, 

where M defines a smoother. Set w^^'^ — w^'^^ . 

2. Coarse grid correction. Restrict (R) r — Bw^^'^ to the coarse grid, apply 
\Lh — c^Ih\ , and prolongate (P) to the fine grid. This delivers the coarse 
grid correction, which is added to wP^'^ : 

(3.6) WH =\Lh -c^Inf^ R{r- BwP"'"), 

(3.7) w'^s^ = wP'''' + Pwh- 

3. Postsmoothing. Apply v smoothing steps: 

(3.8) w^'+^^ = + ]vr*{r - Bw^'l), i = 0, . . . , - 1, w^^^ = w'^^'^, 
where M and v are the same as in step 1. Return w = wP°^^ — w^'^\ 

In (3.6) we assume that \Lh — c^Ih\ is nonsingular, i.e., is different from 
any eigenvalue of Lh- The presmoother is defined by the nonsingular M, while the 
postsmoother is delivered by M* . Note that the (inverted) absolute value appears only 
on the coarse grid, while the fine grid computations are based on the approximation B. 

It is immediately seen that if _B = |L — c^/|. Algorithm 3.1 represents a formal 
two-grid cycle [8, 39] for system 



(3.9) 



8 



EUGENE VECHARYNSKI AND ANDREW V. KNYAZEV 



Note that the introduced scheme is rather general in that different choices of approx- 
imations B and smoothers M lead to different preconditioners. We address these 
choices in more detail in the following subsections. 

It can be verified that the AV preconditioner given by Algorithm 3.1 implicitly 
constructs a mapping r i-^ w = Ttgr, where the operator is 

(3.10) Ttg = {I - M-*By P\Lh -c^Inf^ R{I - BM-^y + F, 

with F = B'^ - {I ~M-*BY B-^ {l~ BM-^y. The fact that the constructed 
preconditioner T — Ttg is SPD follows directly from the observation that the first 
term in (3.10) is SPD provided that P = aR* for some nonzero scalar a, while the 
second term F is SPD if the spectral radii oi I — M~^B and I— M~*B are less than 1. 
The latter condition requires the pre- and postsmoothing iterations (3.5) and (3.8) to 
represent convergent methods for By = r. Note that the above argument essentially 
repeats the one used to justify symmetry and positive definiteness of a preconditioner 
based on the standard two-grid cycle for an SPD system; see, e.g., [5, 38]. 

In this paper we consider two different choices of the approximation B. The 
first choice is given hy B = L, i.e., it is suggested to approximate the absolute value 
\L — c^I\ by the Laplacian L. The second choice is delivered hy B = Pm{L — c^I), 
where Pm is a polynomial of degree at most m such that Pm{L — c^I) ~ |-L — c^/|. 

3.2. Algorithm 3.1 with B — L. If B — L, Algorithm 3.1 can be regarded as 
a step of a standard two-grid method [8, 39] applied to the Poisson equation 

(3.11) Ly^r, 

modified by replacing the operator Lh by \Lh — c^Ih\ on the coarse grid. The question 
remains if the algorithm delivers a form of an approximate solve for absolute value 
problem (3.9), and hence is suitable for AV preconditioning of (3.2). To be able to 
answer this question, we analyze the propagation of the initial error Cg^ = \L — c'^I\~^r 
of (3.9) under the action of the algorithm. 

We start by relating errors of (3.9) and (3.11). 

Lemma 3.1. Given a vector w, consider errors e'^^{w) = \L — c^I\~^r — w and 
e^{w) — L^^r — w for (3.9) and (3.11), respectively. Then 

(3.12) e^V) - e'-iw) + {c^I - Wp)L-^\L - c^/j-V, 

where Wp = 21^|Ap|V^*, Vp is the matrix of eigenvectors of L ~ c^I corresponding to 
the p negative eigenvalues Ai < . . . < Ap < 0, and \Ap\ = diag{\Xi\, . . . , |Ap|}. 
Proof. Observe that for any w, 

e^^(u;) = \L- c^I\-^r - w = \L - c^I\-\ + (e^(w) - V) 
= e^'iw) + \L- c^I\-^L-\L - |i - c2/|)r. 

Denoting A~L — c^I, we use the expression jvll = A — 2V^ApV"p* to get (3.12) □ 
Algorithm 3.1 transforms the initial error eg = L^^r of equation (3.11) into 

(3.13) = S^KS'(el 

where Si ~ I — M^^L and S2 — I ~ M^* L are pre- and postsmoothing operators, 
K = I — P\Lh — c?Ih\^^RL corresponds to the coarse grid correction step, and 
e^ = L^^r — w^""*. Denoting the error of absolute value system (3.9) after applying 
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Algorithm 3.1 by e^^ = \L~ c^I\-^r -wP"'* and observing that = \L-c^I\L-^e^'', 
by (3.12)-(3.13) we obtain 

(3.14) e*^ = {S^KS^\L - c^I\ + c^I - Wp) L'^e^^. 

The last expression gives an explicit form of the desired error propagation operator, 
which we denote by G: 

(3.15) G = {S^KS'[\L - c^I\ + c^I - Wp) . 

Below, as a smoother, we use a simple Richardson's iteration, i.e.. Si = S2 = 
I — tL, where r is an iteration parameter. The restriction R is given by the full 
weighting and the prolongation P by the standard piecewise linear interpolation; 
see [8, 39]. 

At this point, in order to simplify further presentation, let us refer to the one- 
dimensional analogue (3.4) of model problem (3.1). In this case, the matrix L is 
tridiagonal: L = tridiag {-l/Zi^, 2//i2, -l//i2}. We assume that n, the number of 
interior grid nodes, is odd: h = i/{n + 1). The coarse grid is then obtained by 
dropping the odd-numbered nodes. We denote the size of the coarse grid problem by 
N = {n + l)/2 - 1; H = 1/{N + 1) = 2h. The tridiagonal matrix Lh denotes the 
discretization of the ID Laplacian on the coarse level. 

Recall that the eigenvalues of L are 9j — siv? with corresponding eigenvec- 
tors Vj = [sinZjTr/i]"^;^. Similarly, the eigenvalues of Lh are Of = sin^ ^-^^ 
and the coarse grid eigenvectors are denoted by vf = \J2H [sin Iji^H^^^. It is clear 
that operators L — (?I and ~ <?Ih have the same sets of eigenvectors as L and 
Lh with eigenvalues tj = 9j — and tf ~ 9f — , respectively. 

Let Cg^ = '^i'^^i be the expansion of the initial error in the eigenbasis of L. 

Since e*^ = Ggq^ = X]J=i '^iC^^i); '^^ interested in the action of the error 
propagation operator (3.15) on the eigenmodes Vj. We first consider the images of Vj 
under the action of the coarse grid correction operator K. 

The action of the full weighting operator on eigenvectors Vj is well known [8]: 

r S'l;/, j = l,...,N, 

(3.16) Rvj = < 0, j 



c?w,f+i-„ j = N + 2,...,n. 



Here, Cj = cos With Sj — sin the action of the piecewise linear interpolation 
is written as 

(3.17) Pvf = C^Vj - S^jVn+l^j , j = l,...,N, 

where Vn+i-j is the so-called complementary mode of Vj [8]. Combining (3.16) 
and (3.17) leads to the following expression for Kvj: 



(3.18) Kvj 



1 - S- j «J + «j S ' j = l,...,N, 
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Since Vj are the eigenvectors of 5i = 5*2 = / — tL, L — c^I, L ^ and Wp, (3.15) leads 
to explicit expressions for Gvj. 

Theorem 3.2. Let < On+i = 2//i2. Then the error propagation operator G 
in (3.15) acts on the eigenvectors Vj of ID Laplacian as follows: 



(3.19) 



Gvi 



(11) I (12) ■ 1 Ar 

gjVj, j = N + l, 

(21) , (22) ■ Ar , o 

^ g) '"3+9) Vn+1-3, j = N + 2,... 



where 



(3.20) 
(3.21) 
(3.22) 
(3.23) 
(3.24) 



.(11) 



J12) 



J21) 



A22) 




tf\J O3 0, 



5, = (l-r0,)-^ + 



= (l-r0,)- 1-4-^ 



(1 - rejYs]c]-^^{l - r0„+i_,) 



and 13 j = 



2{c' 
0, 



>c2 



Theorem 3.2 impHes that for relatively small shifts, Algorithm 3.1 with B = L 
and a proper choice of r and v reduces the error of (3.9) in the directions of almost 
all eigenvectors Vj. In a few directions, however, the error may be amplified. These 
directions are given by the smooth eigenmodes associated with 0j that are close to 
on the right, as well as with 9j that are distant from on the left. The number of the 
latter, if any, is small if eh is sufficiently small, and becomes larger as ch increases. 

Indeed, let r = so that \l-T0j\ <l for all j and |1 - tOjI < 1/3 for j > N. 

This choice of the parameter provides the least uniform bound for |1 — T6j\ that 
correspond to the oscillatory eigenmodes [34, p. 415]. It is then readily seen that (3.21) 
and (3.24) can be made arbitrarily small within a reasonably small number v of 
smoothing steps. Similarly, (3.22) and (3.23) can be made arbitrarily close to c^/6j < 
1. If << 0N+1, then c^/0j in (3.22) and (3.23) is close to zero. Thus, Theorem 3.2 
shows that for relatively small shifts, smoothing provides small values of (3.21)-(3.24) 
and, hence, damps of the oscillatory part of the error. Note that the damping occurs 
even though the smoothing is performed with respect to (3.11), not (3.9). 

Now let us consider (3.20). Theorem 3.2 shows that if is close to an eigenvalue 
of the coarse-level Laplacian, i.e., if tj^ « 0, then the corresponding reduction 
coefficient (3.20) can be large. This means that Algorithm 3.1 with B = L has a po- 
tential difficulty of amplifying the error in the directions of a few smooth eigenvectors. 
Similar effect is known to appear for standard MG methods applied to Helmholtz type 
problems; see [7, 13]. Below, we analyze (3.20) in more detail. 
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Let 9j > (? . Then, using the relation — CjOj, we can write (3.20) as 
.(11) _n _^fl.^2. (,_^2_ 1 \ <? 



1 

Here, it is easy to see that as (? jB^ — > 0, g'^^^ -> (1 — tQ^Y^ s^^ < 1/2, meaning that 
the smooth eigenmodes corresponding to 9j away from on the right are well damped. 
If 9j < c^ then (3.20) takes the form 

Since c| E (1/2, 1), for any /0j > 1, we can obtain the bound 

eye, - 2 ^ eye, - c] - c] ^ c^/e, - zj^ 
c-^/e, - 1 - c?/ej - d - c^/e^ - 1/2 ■ 



Additionally, 2,-^" < (1 - tO^Y" < 1. Thus, 



4(cV0,)-2 ' 

where Z^- = if 1 < c^/9j < 2, and 1^=2- c^/Oj if c^/Oj > 2. 

The inequality implies that Iffj^^^l < 1 for 1 < c^/9j < 3, i.e., the algorithm 
reduces the error in the directions of several smooth eigenvectors associated with 0j 
to the left of c^. At the same time, we note that as c^/Oj oo, g^^^^ — >■ oo, i.e., the 
smooth eigenmodes corresponding to 0j that are distant from on the left can be 
amplified. Clearly, if ch is sufficiently small then the number of such error components 
is not large (or none), and grows as ch increases. 

The above analysis shows that Algorithm 3.1 with B — L indeed represents a 
solve for (3.9), where the solution is approximated everywhere, possibly except for 
a subspace of a small dimension. In the context of preconditioning, this translates 
into the fact that the preconditioned matrix has spectrum clustered around 1 and — 1 
with a few outliers generated by the amplification of the smooth eigenmodes. If the 
shift is sufhciently small, the number of such outliers is not large, which only slightly 
delays the convergence of the outer PMINRES iterations and does not significantly 
affect the efficiency of the overall scheme. 

3.3. Algorithm 3.1 with B — Pm{L — c^I). The analysis of the previous sub- 
section suggests that the quality of Algorithm 3.1 with B = L may deteriorate as ch 
increases. This result is not surprising, since for larger ch the relation L w |L — c^/| 
becomes no longer meaningful. Below we introduce a different approach for approx- 
imating the fine grid absolute value. In particular, we consider constructing polyno- 
mial approximations B — Pm{L — c^I), where Pm(A) is a polynomial of degree at most 
TO > 0, such that PmiL — c^I) « |L — c^/|. 

Let us first refer to the ideal particular case, where pm{L — c^I) = \L — c^I\. This 
can happen, e.g., if Pm(A) is an interpolating polynomial of /(A) = |A| on the spectrum 
of L — c^I, m — n ~ 1. In such a situation, Algorithm 3.1 with B ~ Pm{L — c^I) 
results in the following transformation of the initial error: 

(3.25) e*^ = S^KS'ie^^, 
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where Si^I-M-^\L- c^I\ and S2 = I - A/" 



|L — c^/| are pre- and postsmoothing 
~ i^'J^Hl — c^I\ corresponds to the coarse grid 

correction step. The associated error propagation operator is further denoted by G, 



operators, and K ~ I ~ P\Lh — c^Ih] ^R\L 



(3.26) 



G = S!^KS'(. 



For the purpose of clarity, we again consider the ID counterpart (3.4) of the model 
problem. As a smoother, we choose Richardson's iteration with respect to absolute 
value system (3.9), i.e., Si — S2 ^ I — t\L — c^I\. It is important to note here that the 
eigenvalues \tj\ of the absolute value operator are, in general, no longer ascendingly 
ordered with respect to j as is the case for 6'j's and tj's. Moreover, in contrast to 
L and L — c^/, the top part of the spectrum of \L — may be associated with 
both smooth and oscillatory eigenmodes. In particular, this means that Richardson's 
iteration may fail to properly eliminate the oscillatory components of the error, which 
is an undesirable outcome of the smoothing procedure. To avoid this, we require that 
1^1 1 < ^Af+i- It is easy to verify that the latter condition is fulfilled if 



(3.27) 



ch < 1. 



Note that (3.27) automatically holds if discretization rule (3.3) is enforced. Repeating 
the above argument for the 2D case also leads to (3.27). 

Let the restriction and prolongation operators R and P be the same as in the 
previous subsection. Similar to (3.18), we obtain an explicit expression for the action 
of the coarse grid correction operator K on eigenvectors Vj: 



(3.28) Kvj 



M 



2 2 
S,C 



n+l-j\ 



'A. 



j = l,...,iV, 

j = iV + l, 

j = N + 2,...,n. 



The following theorem is the analogue of Theorem 3.2. 

Theorem 3.3. The error propagation operator G in (3.26) acts on the eigenvec- 
tors Vj of the ID Laplacian as follows: 



(3.29) 


Gv, = < 


f 9^^. + 
[ 9?% + 


-(12) 

9j V 

-(22) 
9) ^ 


where 








(3.30) 


-(11) 


- (1 - r\t, 


r (1 


(3.31) 


-(12) 


= (1 - T\t, 




(3.32) 




= (1 - T\t, 


r, 


(3.33) 


-(21) 

9 J 


= (1 - T\t, 




(3.34) 


-(22) 

9} 


= (1 - T\t, 





n+l-j, 



1,- 

N- 
N- 



„4 I'-Jl 
^3 \+H 



UH 



''n+l-j I 



1^.1 



Al-r\t 



n+l-j\ 



''ri+l-j I 
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We conclude from Theorem 3.3 that in the ideal case where p,„(L—c^/) = ji— c^/|, 
Algorithm 3.1 with B = Pm{L — c^I) and a proper choice of r and v reduces the error 
of system (3.9) in the directions of all eigenvectors Vj, possibly except for a few that 
correspond to 9j close to the shift (? . Unlike in the case of Algorithm 3.1 with B — L, 
as ch grows, no amplified error components appear in the directions of eigenvectors 
associated with 9j distant from on the left. This suggests that Algorithm 3.1 with 
B = Pm{L — c?I) = \L — c^/| provides a more accurate solve for (3.9) with larger ch. 

To see this, let us first assume that r = /i^/(3 — c^h?). Since (3.27) implies that 
\tj I — tj for j > N, this choice is known to give the smallest uniform bound on 1 1— r|tj 1 1 
corresponding to the oscillatory eigenmodes vj, which is |1 — r|tj|| < 1/(3 — c/i) < 1/2 
with the last inequality resulting from (3.27). Hence, coefficients (3.31)~(3.34) can be 
reduced within a reasonably small number v of smoothing steps. 

Next, we note that (3.30), which is not substantially affected by smoothing, can 
be large if is close to i.e., if tj^ 0. At the same time, we can write (3.30) as 

9^''^ - il ~ r\t,\r (l-S^ 

which shows that l^j^^'*! approaches (1 — r|tj|)^'^s| < 1/2 as increases, i.e., smooth 
error components associated with 9j away from are well damped. 

Thus, if used as a preconditioner. Algorithm 3.1 with B — pm{L — c^I) = |i — c^/| 
aims at clustering the spectrum of the preconditioned matrix around 1 and —1, with 
a few possible outliers that result from the amplification of the smooth eigenmodes 
associated with dj close to c^. Unlike in the case where B — L, the increase of ch 
does not additionally amplify the smooth error components distant from on the 
left. Therefore, Algorithm 3.1 with B = p„i{L — c^I) = |L — can be expected to 
provide a more accurate preconditioner for larger shifts. 

Although our analysis targets the ideal but barely feasible case where Pm{L — 
c^I) — \L — c^I\, it motivates the use of polynomial approximations Pm{L — c^I) w 
|L — c^/| and provides a theoretical insight into the superior behavior of such an option 
for larger ch. In the rest of this subsection we describe a method for constructing such 
polynomial approximations. Our approach is based on the finding that the problem 
is easily reduced to constructing polynomial filters. 

We start by introducing the step function 

[0, A < a; 

where a is a real number, and noting that sign(A) = 2ft,o(A) — 1, so that 
(3.35) \L - c^I\ = {2ha{L - c^I) - I) {L ~ c^l) . 

Here /io(^ — c^I) — ^^o(A)y*, where V is the matrix of eigenvectors of L — c^I and 
/io(A) = diag{0, . . . , 0, 1, . . . , 1} is obtained by applying the step function ft.o(A) to the 
diagonal entries of the matrix A of the associated eigenvalues. Clearly the number of 
zeros on the diagonal of hf){K) equals the number of negative eigenvalues of L — 

Let qm-i{^) be a polynomial of degree at most (to — 1), such that g,„_i(A) 
approximates /io(A) on the interval [a, 6], where a and h are the lower and upper bounds 
on the spectrum of L — c^I , respectively. In order to construct an approximation 



c^s^. 
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p,n{L — of \L — c?I\, we replace the step function hQ{L — c^I) in (3.35) by the 
polynomial g„i_i(L — c^/). Thus, 

(3.36) \L - c^I\ « praiL ~ c^I) = (2g„_i(L - c^I) - l) {L - c^l) . 

The matrix L — c^I is readily available on the fine grid. Therefore, we have 
reduced the problem of evaluating the polynomial approximation pm of the abso- 
lute value operator to constructing a polynomial Qm-i that approximates the step 
function Hq. More specifically, since Algorithm 3.1 can be implemented without the 
explicit knowledge of the matrix B, i.e., B can be accessed only through its action 
on a vector, we need to construct approximations of the form qm-i{L — c^T)v to 
/io(i — c?I)v, where w is a given vector. 

The task of constructing qm-i{L — c^I)v « ho{L — c^I)v represents an instance of 
polynomial filtering, which is well known; see, e.g., [11, 32, 45]. In this context, due to 
the property of filtering out certain undesirable eigencomponents, the step function 
ho is called a filter function. The approximating polynomial qm-i is referred to as a 
polynomial filter. 

State-of-the-art polynomial filtering techniques such as [32] would first replace 
the discontinuous step function /io(-^) by a smooth approximation on [a,b] and then 
approximate the latter by a polynomial in the least-squares sense. In this paper, we 
follow a simpler approach based on the direct approximation of ho{X) using Chebyshev 
polynomials [30, 31]. The constructed polynomial Qm-i allows defining qm-i{L — 
(?I)v « ho{L — c^I)v and hence Pm{L — c?I)v \L — (?I\v. Below we describe the 
entire procedure. 

Let 

<-) ^^(^)-^ 

be a linear change of variable which maps A e [a, 6] to ^ e [—1, 1]. Then note that 

b + a 



ho{\) = ha{^), a = - 



b — a 



Let Ti(^) = cos(i arccos(^)) be the Chebyshev polynomials of the first kind, ^ G 
[—1,1]. Then the Chebyshev least-squares approximation qm-i{^) of the step function 
ha{£,) is of the form 

d 

(3.38) g„,_i(C)= 5^7*7^.(0, ^£[-1,1], 

i=0 

where the expansion coefficients are given by 
1 1 

haiCmiOd^, i = 0,...,m-l; 



r 1 

with m — T^{^)d^. Calculating the above integrals gives the exact ex- 



pressions for 7^, 



(3.39) 



— (arccos(a)) , i = 0, 

TT 



2 /sin(i arccos(a)) 
TT \ i 



i > 1. 
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Since /io(A) = /iq(^), we define the polynomial approximation (7m-i(A) to /io(A) 
as q,n-i{X) ^ where ^ is given by (3.37), a = -{b + a) /{b - a), and 9,„_i(C) 

is constructed according to (3.38), (3.39). Thus, by (3.36), we get 

m— 1 

\L - ^I\v « p™(L - ^I)v = 2 ^ lrT,{C)t -t, t={L- ^I)v, 

i=Q 

where C = (L — c^l) — — /. Since the Chebyshev polynomials are generated 

b — a b — a 

using the three-term recurrent relation [30, 31] 

T,(e) = 2er,_i(0 - r,_2(0, T^ii) = T^iO - 1, * - 2,3, ... , 

the procedure for constructing the desired vectors w — Pm{L — cP'I)v w |i — c^I\v can 
be summarized by the following algorithm. 

Algorithm 3.2 (Polynomial approximation of the absolute value). Input: v, 
TO, [a,b] D A{L — c^I). Output: w = Pm{L — I)v \L — I\v . 

1. Set {L- c^I)v. 

2. Set C ^ (L-c^l) - l^I. Set -{b + a)/(b - a) . 

b — a b ~ a 

3. Set Wo ■'^ V, vi-<r- Cv, wi ^ 70^0 + 71^1. Throughout, compute 7^ by (3.39). 

4. For i = 2, . . . , TO — 1 do 

5. Vi 2Cwi_i - v^_2 

6. Wi-i+-fiVi 
1. EndFor 

8. Return w 2wm-\ — v. 

Algorithm 3.2 provides means to replace a matrix-vector product with the un- 
available \L — c^I\ by a procedure that involves a few multiplications with the shifted 
Laplacian L — c^I. As we further show, the degree m of the approximating polynomial 
can be kept reasonably low. Moreover, in the MG framework discussed in the next 
subsection, the algorithm has to be invoked only on sufficiently coarse grids. 

3.4. The MG AV preconditioner. Now let us consider a hierarchy of s + 1 
grids numbered by ^ = s, s — 1, . . . , with the corresponding mesh sizes {hi\ in 
decreasing order [hs — h corresponds to the finest grid, and /lo to the coarsest). For 
each level I we define the discretization Li — c^Ii of the differential operator in (3.1), 
where Li is the Laplacian on grid I, and /; is the identity of the same size. 

In order to extend the two-grid AV preconditioner given by Algorithm 3.1 to the 
multigrid, instead of inverting the absolute value \Lh — c^Ih\ in (3.6), we recursively 
apply the algorithm to the restricted vector R{r — Bw^^'^). This pattern is then 
followed in the V-cycle fashion on all levels, with the inversion of the absolute value 
of the shifted Laplacian on the coarsest grid. The matrix B on level / is denoted by 
Bi. Each Bi is assumed to be SPD and is expected to approximate \Li — c^/;|. 

In the previous subsections we have considered two choices of B for the two-grid 
preconditioner in Algorithm 3.1. In the MG framework, these choices give Bi = Li 
and Bi — pmi {Li — c^Ii), where pmi is a polynomial of degree at most to; on level I. 

The advantage of the first option, Bi = Li, is that it can be easily constructed 
and the application of Bi to a vector is inexpensive even if the size of the operator 
is very large. According to our analysis for the ID model problem in subsection 3.2, 
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the approach is suitable for c/i; sufficiently small. Typically this is a case for / corre- 
sponding to finer grids. However, c/i; increases with every new level. This may result 
in the deterioration of accuracy of the overall MG preconditioning scheme, unless the 
size of the coarsest level is kept sufficiently large. 

The situation is different for the second option Bi = p„j, (L/ — c^Ii). In this case, 
applications of Bi may be expensive on finer grids because they require a sequence 
of matrix-vector multiplications with large shifted Laplacian operators. However, on 
coarser levels, i.e., for larger chi, this is not restrictive because the involved oper- 
ators are significantly decreased in size compared to the finest level. Additionally, 
as suggested by the analysis in subsection 3.3, if Pm, (L; — c^Ii) represent reasonable 
approximations of \Li — c^Ii \ on levels I, one can expect a higher accuracy of the whole 
preconditioning scheme compared to the choice Bi = Li. 

Our idea is to combine the two options. Let 6 G (0, 1) be a "switching" parameter, 
where for finer grids chi < 6. We choose 



(3.40) Bi 



Li, chi < S, 

p„n{Li - c^Ii), chi > S. 



The polynomials p„i, (L; — c^Ii) are accessed through their action on a vector and are 
constructed using Algorithm 3.2 with L — c^ I = Li — c^ Ii and m = mi. 

Summarizing our discussion, if started from the finest grid I = s, the following 
scheme gives the multilevel extension of the two-grid AV preconditioner defined by 
Algorithm 3.1. The subscript I is introduced to match quantities to the corresponding 
grid. We assume that the parameters 6, nii, vi, and the smoothers Mi are pre-specified. 

Algorithm 3.3 (AV-MG(r(): the MG AV preconditioner) . Inputri. Outputwi. 

1. Set Bi by (340). 

2. Presmoothing. Apply i>i smoothing steps, vi > I: 

(3.41) = ^ + M-\ri - Biwf^), i = 0, . . . , z// - 1, wf^ = 0, 

where Mi defines a smoother on level I. Set w^^'^ — 

3. Coarse grid correction. Restrict (Ri^i) ri — Biw^'^ to the grid I— 1, recursively 
apply AV-MG, and prolongate (Pi) back to the fine grid. This delivers the 
coarse grid correction added to w^'^ '■ 



(3.42) wi-i 



^F-MG(i?,_i(n-B,<")), />1; 



(3.43) «;[^^ = «;r +P,^i_i. 
4-. Postsmoothing. Apply vi smoothing steps: 

(3.44) wf+^^ = wf'^ + M-* (n - Biwl''>), i = 0, . . . , - 1, ^ = w";^', 

where Mi and vi are the same as in step 2. Return wi — wf°'^^ = w\'^'\ 
The described MG AV preconditioner implicitly constructs a mapping denoted 
by r I— > w = Tmgr, where the operator T = Tmg has the following structure: 

(3.45) T„<, = (/ - M'* bY PT^^;^^ R (/ - BM'^Y + F, 
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with F as in (3.10) and ri"g defined according to the recursion 

.3 4gx tI^I - {li~Mr*BiY' PiT,\l-''>Ri^,{li~BiM^'Y' +Fi, 

Ti^l = |io-c2/oP\ / = l,...,s-l, 

where Fi - B^^ - {h - M^BiY' B^^ {h - BiM^^Y' . 

The structure of the multilevel preconditioner T — T^g in (3.45) is the same as 
that of the two-grid preconditioner T = Ttg in (3.10), with \Lh — c^Ih \ ^ replaced by 

the recursively defined operator T^~'^^ in (3.46). Thus, the symmetry and positive 
definiteness of T = Tmg follows from the same property of the two-grid operator 
through relations (3.46), provided that Pi = aiR^_^ and the spectral radii of // — 
Mf^B; and /; - Mf*B, are less than 1 throughout the coarser levels. We remark 
that preconditioner (3.45)-(3.46) is non- variable, i.e., it preserves the global optimality 
of PMINRES. 

The simplest possible approach for computing wq in (3.42) is to explicitly con- 
struct \Lo — c'^Io\~^ through the full eigendecomposition of the coarse- level Laplacian, 
and then apply it to i?o(^i ~ BiwY'^)- An alternative approach is to determine wq as 
a solution of the linear system (Lq — c?Io + 2Vb|Ao|Fo*)'"'o = -Ro('"i — BiwY'^), where 
Vb is the matrix of eigenvectors associated with the negative eigenvalues of Lq — c^/q 
contained in the corresponding diagonal matrix Aq. In the latter case, the full eigen- 
decomposition of Lq is replaced by the partial eigendecomposition targeting negative 
eigenpairs, followed by a linear solve. To solve the linear system approximately, 
additional gains may be obtained by applying the Woodbury formula [19, p. 50] to the 
matrix (Lp — c^/q -|- 2Vo|Ao| V^*)"^, provided that systems Cz — r can be efficiently 
solved for some C ^ Lq — c^Iq. In this work, we rely only on exact coarse grid solves. 

Since we use Richardson's iteration with respect to Pmi{Li — c^Ii) as a smoother 
on coarser grids, as motivated by the discussion in subsection 3.3, the guidance for 
the choice of the coarsest grid is given by condition (3.27). More specifically, in the 
context of the standard coarsening procedure (/ij-i — 2hi), we select hierarchies of 
grids satisfying chi < 1 for ^ = s, . . . , 1, and c/iq > 1. As shown in the next section, 
even for reasonably large c^, the coarsest-level problems are small. 

The parameter 6 in (3.40) should be chosen to ensure the balance between com- 
putational costs and the quality of the MG preconditioner. In particular, if S is 
reasonably large then the choices of Bi are dominated by the option Bi — Li, 
which is inexpensive but may not be suitable for larger shifts on coarser levels. 
On the other extreme, if 6 is close to zero then the common choice corresponds to 
Bi = Pmi {Li — (?Ii), which provides a better preconditioning accuracy for larger shifts 
but may be too computationally intense on finer levels. In our numerical experiments, 
we keep 5 e [1/3,3/4]. 

As we demonstrate in the next section, the degrees m/ of the occurring polynomi- 
als Pmi should not be large, i.e., only a few matrix- vector multiplications with Li — c?Ii 
are required to obtain satisfactory approximations of absolute value operators. For 
properly chosen 5, these additional multiplications need to be performed on grids that 
are significantly coarser than the finest grid, i.e., the involved matrices Li — c?Ii are 
orders of magnitude smaller than the original fine grid operator. As confirmed by 
our numerical experiments, the overhead caused by the polynomial approximations 
appears to be marginal and does not affect much the computational cost of the overall 
preconditioning scheme. 
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4. Numerical experiments. This section presents a numerical study of the 
MG preconditioner in Algorithm 3.3. Our goal here is twofold. On the one hand, the 
reported numerical experiments serve as a proof of concept of the AV preconditioning 
described in Section 2. In particular, we show that the AV preconditioners can be 
constructed at essentially the same cost as the standard preconditioning methods (MG 
in our case). On the other hand, we demonstrate that the MG AV preconditioner 
in Algorithm 3.3 combined with the optimal PMINRES iteration, in fact, leads to 
an efficient and economical computational scheme, further called MINRES-AV-MG, 
which outperforms several known competitive approaches for the model problem. 

Let us briefly describe the alternative preconditioners used for our comparisons. 
Throughout, we use matlab for our numerical examples. 

The inverted Laplacian preconditioner. This strategy, introduced in [2], 
is a representative of an SPD preconditioning for model problem (3.2), where the 
preconditioner is applied through solving systems Lw = r, i.e., T — L^^. As has 
been previously discussed, for relatively small shifts c^, the Laplacian L constitutes 
a good SPD approximation of \L — In this sense, the choice T ~ L^^ perfectly 
fits, as a special case, into the general concept of the AV preconditioning presented in 
Section 2. We refer to PMINRES with T = L"! as MINRES-Laplace. 

Usually, one wants to solve the system Lw — r only approximately, i.e., use 
T ss L~^. This can be efficiently done, e.g., by applying the V-cycle of a standard MG 
method [8, 39]. In our tests, however, we perform the exact solves using the matlab's 
"backslash", so that the reported results reflect the best possible convergence with 
the inverted Laplacian type preconditioning. 

The indefinite MG preconditioner. We consider a standard V-cycle for prob- 
lem (3.2). Formally, it can be obtained from Algorithm 3.3 by setting Bi = Li — c^Ii 
on all levels and replacing the first equality in (3.42) by the linear solve with Lq — c^Iq. 
The resulting MG scheme is used as a preconditioner for restarted GMRES and for 
Bi-CGSTAB. We refer to these methods as GMRES(fc)-MG and Bi-CGSTAB-MG, 
respectively; k denotes the restart parameter. A thorough discussion of the indefinite 
MG preconditioning for Helmholtz problems can be found, e.g., in [13]. 

Table 4.1 

The largest problem sizes satisfying chi > 5 for different values of the shift (? , "switching" 
parameters 5, and the standard coarsening scheme hi_i = 2/i;. The last row (5 = 1) corresponds to 
the sizes of the coarsest problems for different c^ . 





c2 = 300 


c2 = 400 


c2 = 1500 


c2 = 3000 


c2 = 4000 


5 = 1/3 


961 


961 


3969 


16129 


16129 


5 = 1/2 


961 


961 


3969 


3969 


3969 


5 = 3/4 


225 


225 


961 


3969 


3969 


.5=1 


225 


225 


961 


961 


961 



In our tests, we consider 2D model problem (3.2) corresponding to (3.1) discretized 
on the grid of size h = 2^^ (the fine problem size n — 65025). The exact solution x* 
and the initial guess xo are randomly chosen. The right-hand side 6 = (i — c^I)x* , 
which allows evaluating the actual errors along the steps of an iterative method. All 
the occurring MG preconditioners are built upon the standard coarsening scheme (i.e., 
hi-i = 2hi), restriction is based on the full weighting, and prolongation on piecewise 
multilinear interpolation [8, 39]. 
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Let us recall that Algorithm 3.3 requires setting a parameter 6 to switch between 
Bi = L and p„i, (L—c^I) on different levels; see (3.40). Assuming standard coarsening, 
Table 4.1 presents the largest problem sizes corresponding to the condition chi > S for 
a few values of 6 and c^. In other words, given 6 and c^, each cell of Table 4.1 contains 
the largest problem size for which the polynomial approximation of \Li — c^Ii\ is 
constructed. Unless otherwise explicitly stated, we set 5 = 1/3. Note that according to 
the discussion in subsection 3.4 (condition (3.27)), the row of Table 4.1 corresponding 
to (5 = 1 delivers the sizes no of the coarsest problems for different shift values. 

Table 4.1 shows that the coarsest problems remain relatively small even for large 
shifts. The polynomial approximations are constructed for coarser problems of sig- 
nificantly reduced dimensions, which in practical applications are negligibly small 
compared to the original problem size. 

As a smoother on all levels of Algorithm 3.3 we use Richardson's iteration, i.e., 
Af;^^ = Till. On the finer levels, where Bi = L/, we choose r; = hf/h and vi = 1. On 
the coarser levels, where Bi = {Li — c^h), we set t; = /if /(5 — c^h'f) and vi — 5. 

Similar to the ID case considered in subsections 3.2 and 3.3, both choices of 
T; correspond to the optimal smoothing of oscillatory eigenmodes with respect to 
the 2D operators Li and \Li — c^Ii], respectively [8, 39]. Since Pmi{Li — c^Ii) only 
approximates \Li — c^Ii \ in practice, the choice of ti on the coarser grids is likely to be 
not optimal. Therefore, for chi > (5, we have increased the number of smoothing steps 
to 5. In all tests, the degrees mi of polynomials are set to 10. The intervals containing 
A(i; — c^Ii), required for evaluating are [— c^,8/ij~^ — c^]. The inverted coarse 
grid absolute value |Lo — c^/o|^^ is constructed by the full eigendecomposition. 




Fig. 4.1. Comparison of several preconditioned schemes; n = 65025. 
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In Figure 4.1, we compare MINRES-AV-MG with the above introduced alter- 
native preconditioned schemes for the model problem. Each plot corresponds to a 
different shift value. The restart parameter k varies for all runs of GMRES(fc)-MG, 
increasing (left to right and top to bottom) as grows from 300 to 3000. In our 
tests, the size uq of the coarsest problem in Algorithm 3.3 is 225 (Figure 4.1, top) and 
961 (Figure 4.1, bottom); see Table 4.1 with (5 = 1. The same no is used for the MG 
preconditioner in the corresponding runs of GMRES(fc)-MG and Bi-CGSTAB-MG. 

Figure 4.1 shows that MINRES-AV-MG noticeably outperforms PMINRES with 
the inverted Laplacian preconditioner. For smaller shifts (c^ = 300, 400), MINRES- 
AV-MG is comparable, in terms of the iteration count, to GMRES(fc)-MG and Bi- 
CGSTAB-MG; fc = 5, 10. For larger shifts (c^ = 1500, 3000), however, MINRES- 
AV-MG provides a superior convergence behavior. In particular, the scheme exhibits 
faster convergence than GMRES(A;)-MG under less demanding storage requirements, 
while Bi-CGSTAB-MG fails to converge (fc = 50, 150). 

If the polynomial approximations in Algorithm 3.3 appear only on sufficiently 
coarse grids and the size no of the coarsest problem is relatively small, then the addi- 
tional costs introduced by the coarser grid computations of the AV MG preconditioner 
are negligible relative to the cost of operations on the finer grids, which are the same 
as in the standard V-cycle for the indefinite problem. This means that the complexity 
of Algorithm 3.3 is similar to that of the MG preconditioner in GMRES(fc)-MG and 
Bi-CGSTAB-MG. 

To be more precise, in the tests reported in Figure 4.1 (bottom), a single ap- 
plication of Algorithm 3.3 has required 15-20% more time than the indefinite MG 
preconditioner, even though the polynomial approximations in Algorithm 3.3 have 
been constructed for problem sizes as large as 16129 if — 3000, which is of the same 
order as the original problem size n. For larger problem sizes, the time difference 
becomes negligible. For example ii h = (n = 261121), Algorithm 3.3 results in 
only 5% time increase, and the relative time difference becomes indistinguishable for 
smaller h. The application of all the MG preconditioners in Figure 4.1 (top) required 
essentially the same time. 

The above discussion suggests that our numerical comparison, based on the num- 
ber of iterations, is representative. Additionally, in Table 4.2 we provide the time 
comparisons for the MG preconditioned schemes. In particular, we measure the ac- 
tual time required by the runs of MINRES-AV-MG (i^y), GMRES(fc)-MG (ic), and 
Bi-CGSTAB-MG (is) in Figure 4.1, and report the speed-ups. 



Table 4.2 

Time comparison of the MG preconditioned schemes in Figure 4-1 ■ 





c2 = 300 


c2 = 400 


c2 = 1500 


c2 = 3000 


ts/tAV 


1.1 


1.1 






tc/tAV 


1.4 


1.3 


2.6 


1.9 



We have observed that the performance of GMRES(fc)-MG can be improved by 
increasing the restart parameter. In Figure 4.1, however, the values k have been 
chosen to balance between storage and convergence behavior. In particular, we set k 
to be sufficiently small, so that the storage required for GMRES(fc)-MG is as close as 
possible to that of MINRES-AV-MG, while the convergence of the method is not lost. 
Since Bi-CGSTAB-MG is based on a short-term recurrence, its storage expenses are 
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similar to that MINRES-AV-MG. 

The unsatisfactory performance of GMRES(fc)-MG in Figure 4.1 can, in part, 
be attributed to the observation that smoothing based on Richardson's (or, more 
generally, Jacobi) iteration becomes increasingly unstable as grids coarsen. In par- 
ticular, as shown in [13], on the intermediate levels with chi > 1/2 this smoothing 
scheme strongly amplifies the smooth error eigenmodes. A straightforward remedy is 
to invoke the coarse grid solve on the largest grid that fails to satisfy chi < 1/2. 

Table 4.3 

Number of iterations of MINRES-AV-MG (MINR) and GMRES(k)-MG (GMR(k)) required to 
reduce the initial error by 10~*; n = 65025. The preconditioner in GMRES(k)-MG uses Richard- 
son's smoothing on levels chi < 1/2 and invokes the coarse grid solve on the level that follows. 
Numbers in the parentheses correspond to the right preconditioned GMRES(k)-MG. Dash denotes 
that the method failed to converge within 1000 steps. 





MINR 


GMR(5) 


GMR(IO) 


GMR(20) 


GMR(25) 


GMR(35) 


= 1500 


89 


29(44) 


20(22) 


16(18) 


16(18) 


16(18) 


c2 = 3000 


282 


-(-) 


-(-) 


-(-) 


223(269) 


69(69) 


c2 = 4000 


310 


-(-) 


-(-) 


-{-) 


-(-) 


395(471) 



In Table 4.3, we compare MINRES-AV-MG and GMRES(yfc)-MG with different 
values of the restart parameter. We report the iteration counts required to reduce 
the initial error by 10"** for systems with = 1500, 3000, and 4000. The indefinite 
MG preconditioner in GMRES(fc)-MG is configured to run Richardson's smoothing 
on grids chi < 1/2 and perform the coarse grid solve on the level that follows. We 
test the indefinite MG preconditioner in the left (used so far) and right preconditioned 
versions of GMRES(fe). 

The above described setting of the right preconditioned GMRES(fc)-MG repre- 
sents a special case of the Helmholtz solver introduced in [13]. In this paper, instead 
of the "early" coarse grid solve, the MG preconditioning scheme performs further 
coarsening on levels chi > 1/2 with GMRES used as a smoother. Since the resulting 
preconditioner is nonlinear, (the full) FGMRES [33] is used for outer iterations. 

It is clear that the results in Table 4.3 provide an insight into the best possi- 
ble outer iteration counts that can be expected from the restarted version of the 
method in [13]. In fact, the same observation is used by the authors of [13] for hand- 
tuning the smoothing schedule of their preconditioner to study its best-case perfor- 
mance. Table 4.3 then demonstrates that, regardless of the smoothing schedule and 
the smoother's stopping criterion, the (F)GMRES based solvers, such as, e.g., the one 
in [13], require increasing storage to maintain the convergence as grows, whereas 
the robustness of MINRES-AV-MG is not lost under the minimalist memory costs. 

Thus, if the shift is large and the amount of storage is limited, so that k is forced 
to be sufficiently small, the (F)GMRES(fc) outer iterations may fail to converge within 
a reasonable number of steps, even if the coarse grid solve in the MG preconditioner 
is performed "early" . We note, however, that if storage is available or the shifts are 
not too large, the (F) GMRES based methods may represent a valid option. 

We have also tested the Bunch-Parlett factorization [18] as a coarse grid solve 
in the MG framework. In particular, as a preconditioner for MINRES, we have used 
Algorithm 3.3 with Bi = L; on all levels and the coarsest-grid absolute value in (3.42) 
replaced by the application of the "perfect" Bunch-Parlett factorization based pre- 
conditioner [18]. We have obtained results that are inferior to the schemes considered 
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in this paper for shifts not too small, e.g., for > 200 if = 2 The unsatisfac- 
tory behavior may be related to the fact that the inverted Laplacian T = L^^ and 
the ideal absolute value T = |L — c^/| preconditioners share the same eigenvectors 
with A = L — c^I, while the preconditioner from [18] does not. 

The standard MG preconditioners, as in GMRES(A:)-MG and Bi-CGSTAB-MG, 
are known to have optimal costs, linearly proportional to n. As discussed above, the 
same is true for the AV preconditioner in Algorithm 3.3. Therefore, if, in addition, the 
number of iterations in the iterative solver preconditioned with Algorithm 3.3 does 
not depend on the problem size, the overall scheme is optimal. 

Table 4.4 

Mesh-independent convergence of PMINRES with the MG AV preconditioner. The numbers in 
parentheses correspond S = 3/4. The default value of S is 1/3. 





h = 2-6 


h = 2-'' 


h = 2-^ 


/i = 2-9 


/1-2-10 


/i = 2-ii 


c2 = 300 


31(31) 


31(31) 


30(32) 


30(32) 


30(32) 


30(30) 


c2 = 400 


37(40) 


38(40) 


37(40) 


37(40) 


37(40) 


37(39) 


c2 = 1500 


67(97) 


97(119) 


89(109) 


88(108) 


89(106) 


90(107) 


c2 = 3000 


228(229) 


222(284) 


279(332) 


256(298) 


257(296) 


256(298) 



We verify this optimality in Table 4.4, which shows the mesh-independence of the 
convergence of PMINRES with the MG AV preconditioner. The rows of the table 
correspond to the shift values c^, while the columns match the mesh size h. The cell 
in the intersection contains the numbers of steps performed to achieve the decrease 
by the factor 10~^ in the error 2-norm with the choices of the "switching" parameter 
5 = 1/3 and 5 = 3/4. 

As previously, the size of the coarsest grid has been set according to Table 4.1 
with 6=1. We conclude that the convergence does not slowdown with the decrease 
of h; thus, PMINRES preconditioned by Algorithm 3.3 is optimal. Note that for 
larger shifts, = 1500 and = 3000, mesh-independent convergence occurs for h 
sufficiently small, when the "switching" pattern is stabilized, i.e., Bi = L; on a few 
finer grids and Bi = p,„j (L/ — c^Ii) on the coarser grids that follow. 

Table 4.4 shows that as grows, the increase in the iteration count is mild and 
essentially linear. As expected, the smaller value of S, which leads to the construction 
of the polynomial approximations earlier on finer levels, results in a higher accuracy 
of the AV preconditioner. 

Finally, in Figure 4.2 we plot the eigenvalues of T |L — c^/| and T{L — c?I) for 
different shift values using the logarithmic scale; n = 16129. As suggested by Corol- 
lary 2.5, clusters of eigenvalues of T\L — c^l\ are preserved in the spectrum of the 
preconditioned matrix T{L — Almost all eigenvalues of T{L — (?I) are clustered 
around —1 and 1, with only a few falling outside of the clusters. We note that the 
clustering and the condition number k(T |^|) deteriorate as increases from 300 to 
3000, which is compatible with the results in Table 4.4. 

The spectra computed in Figure 4.2 allow validating numerically the tightness of 
bounds (2.4) in Theorem 2.3 for the MG AV preconditioner. In Table 4.5, we report 
the number of eigenvalues \j of T{L — (?I) that satisfy either the upper or the lower 
bound up to machine precision. The table shows that the bound is numerically sharp. 



ABSOLUTE VALUE PRECONDITIONING 



23 





3000 r 








1500< 


> 




Shift 


400 
300 



Eigenvalues of TIL - 11 and T(L - I) for different shifts 
" i3000r °-" '•' " 1 3000 1 — 



00 amBO o 



A(T|L-c2||) 




-10° -10-^ 



10° 10^ 



A(T(L - c" l))<0 



A(T(L - c" l))>0 



Fig. 4.2. Spectrum ofT\L — c^l\ (left), negative eigenvalues ofT(^L — c^l) (center), and 
positive eigenvalues of T (^L — c^l) (right); n = 16129. 

Table 4.5 

Number of eigenvalues Xj that equal the upper/lower bound in (2.4) up to machine precision. 





c2 = 300 


c2 = 400 


c2 = 1500 


c2 = 3000 


Upper 





15 


1 


115 


Lower 








10 






5. Conclusions. We propose a new approach for SPD preconditioning for sym- 
metric indefinite systems, based on the idea of implicitly constructing approximations 
to the inverse of the system matrix absolute value. A multigrid example of such a 
preconditioner is presented, for a real-valued Helmholtz problem. Our experiments 
demonstrate that PMINRES with the new MG absolute value preconditioner leads to 
an efficient iterative scheme, which has modest memory requirements and outperforms 
traditional GMRES based methods if available memory is tight. 

Acknowledgments. The authors thank Michele Benzi, Yvan Notay, and Joe Pas- 
ciak for their comments on the draft of the manuscript. 
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